Disease outbreaks 1996 - 2022

R for Bio Data Science

Group 12

November 28, 2023

Introduction

Disease outbreaks occur when an infectious disease spreads rapidly through a population, sometimes across the world. They can be caused by bacteria, viruses, fungi or parasites.

Infectious disease outbreaks pose severe global health threats, for example:

  • COVID-19 2019-2022 (affected 562 million people worldwide)

  • SARS-CoV in 2003, severe acute respiratory syndrome

  • Influenza A (H1N1) in 2009-2010

Environmental factors can influence disease transmission; these include socioeconomic as well as regional and geographic factors.

The aim of the original article's authors was to create a new dataset of infectious disease outbreaks from the WHO Disease Outbreak News and Coronavirus Dashboard.

Our aim was to:

  • Reproduce their findings
  • Add other factors such as socioeconomic data

Materials and Methods

Result 1

Final data set handling:

Result 2

Outbreaks over time

  • Diseases with the most outbreaks: COVID-19, Influenza virus, Cholera

  • The frequency of zoonotic disease outbreaks has increased over the last decade

Figures: Yearly Outbreaks Frequency (Reproduced); Yearly Outbreaks colored by Disease Classification

Result 3

Outbreak frequency and income status

  • The top 20 diseases -> almost equally distributed across income groups

  • Excluding COVID-19 and influenza -> 66% in low-income or lower-middle-income countries
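
The income-group share above is a simple proportion: drop the excluded diseases, count the remaining outbreaks per income group, and divide by the total. A minimal sketch of that arithmetic follows, in plain Python rather than the tidyverse pipeline used for the project. The records, the function name `income_shares` and the income-group labels are illustrative assumptions, not the study data or the project's code.

```python
from collections import Counter

# Toy records (made up for illustration, NOT the study data):
# one (disease, income_group) pair per recorded outbreak.
outbreaks = [
    ("COVID-19", "High income"),
    ("Influenza", "Upper middle income"),
    ("Cholera", "Low income"),
    ("Cholera", "Lower middle income"),
    ("Ebola virus disease", "Low income"),
    ("Measles", "High income"),
]


def income_shares(records, exclude=()):
    """Return each income group's share of outbreaks, skipping excluded diseases."""
    kept = [income for disease, income in records if disease not in exclude]
    counts = Counter(kept)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {group: n / total for group, n in counts.items()}


# Hypothetical usage: exclude the two dominant diseases, then add up the
# two lowest income groups.
shares = income_shares(outbreaks, exclude={"COVID-19", "Influenza"})
low_share = shares.get("Low income", 0) + shares.get("Lower middle income", 0)
print(f"Low or lower-middle income share: {low_share:.0%}")
```

Running the same function with and without the `exclude` argument shows how much the dominant diseases shift the distribution.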

Fig. 4b Top 20 disease outbreaks (Reproduced); Fig. 4b Top 20 disease outbreaks colored by country income status (Novel)

Result 4

Spatial distribution of outbreak frequency

  • 13 of the top 20 countries -> Africa

  • Country with the 3rd most outbreaks -> USA

Figures: Top 20 Countries with highest Outbreak Frequency; Outbreak Frequency Map Graph

Result 5

Figures: Outbreak Frequency by Continent Line Graph; Outbreak Frequency by Income Line Graph

Discussion 1

  • The authors of the article adhere to FAIR guidelines for scientific data management
    • Findability, Accessibility, Interoperability, Reusability

    • The use of standardised naming (ISO-3166 and ICD-10) makes it possible to merge the data with data from other resources.
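
Standardised country codes matter because they give both tables a shared key. The sketch below joins outbreak counts to income status on ISO 3166-1 alpha-3 codes and flags countries with no match. It is a plain-Python illustration under stated assumptions: the values, the table names and the helper `left_join` are hypothetical and do not come from the project, which used tidyverse joins.

```python
# Toy tables keyed by ISO 3166-1 alpha-3 country code.
# All values are made up for illustration, NOT taken from the study.
outbreak_counts = {"COD": 5, "USA": 3, "NGA": 4}
income_status = {"COD": "Low income", "USA": "High income"}


def left_join(counts, income):
    """Attach an income status to every country in `counts`.

    Countries missing from `income` keep a row with income set to None,
    so unmatched codes stay visible instead of being silently dropped.
    """
    return [
        {"iso3": iso3, "outbreaks": n, "income": income.get(iso3)}
        for iso3, n in sorted(counts.items())
    ]


merged = left_join(outbreak_counts, income_status)

# Codes without a partner in the income table are worth inspecting by hand.
unmatched = [row["iso3"] for row in merged if row["income"] is None]
print(merged)
print("Unmatched codes:", unmatched)
```

Keeping unmatched rows, as a left join does, makes it easy to spot countries whose codes differ between sources before any summary is computed.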

Strengths

  • Accessed and tidied the data (using tidyverse principles)

  • Reproduced the plots provided in the article and added novel graphics

  • Added new data on income status, providing further knowledge on outbreaks

Limitations:

  • Did not include new DONs via web scraping (lacked permission, packages and time)

  • Income status is the current classification, not per year

Discussion 2

Important for the development of targeted strategies

  • Geographical and socioeconomic factors can influence the spread of disease and the susceptibility of the population
    • Vaccine consideration before traveling to high-risk geographical regions
  • The study data do not reflect the intensity of outbreaks (cases, deaths)
    • This would require epidemiological data
  • Can be used to improve biosurveillance: detection and prevention of biological threats

Thank you!